fix(FR-2870): catch storage host fetch errors in session launcher#7365

Open
ironAiken2 wants to merge 1 commit into main from 05-12-fix_fr-2870_allow_session_creation_when_storage_list_fetch_fails

Conversation

@ironAiken2 ironAiken2 commented May 12, 2026

Resolves #7364 (FR-2870)

Summary

When the manager's vfolder host info fetch (/folders/_/hosts) fails (storage proxy 500, manager unreachable, etc.), the Session Launcher previously surfaced a generic page-level error from BAIErrorBoundary ("An error has occurred.") with a reload button. The user could not tell what had actually failed, and reloading did not help.

This PR replaces that experience with a dedicated, recognizable error UI at the exact existing fetch site:

  • useProjectResourceGroups (packages/backend.ai-ui/src/hooks/useProjectResourceGroups.ts) is the live host-fetch path on the Session Launcher page — it powers BAIProjectResourceGroupSelect, which the launcher renders inside ResourceAllocationFormItems. It already runs /scaling-groups and /folders/_/hosts in parallel and the useSuspenseTanQuery underneath throws on any failure; before this change both failures were indistinguishable.
  • Switch the parallel fetch from Promise.all to Promise.allSettled, then discriminate: a host-info failure is re-thrown as a tagged StorageHostFetchError, while a scaling-groups failure is re-thrown as-is and bubbles to the generic boundary. Host-info failure takes precedence when both fail because SFTP filtering depends on it.
  • Wrap the /session/start route with a new StorageHostFetchErrorBoundary whose fallback matches the BAIErrorBoundary look (<Result status="warning">) but shows the message "Failed to fetch storage host information." (KO: "스토리지 호스트 정보를 가져오는데 실패했습니다.") and a "Go back to the previous page" primary button that calls history.back(). Any non-StorageHostFetchError is re-thrown so the outer BAIErrorBoundary continues to render its generic UI for unrelated failures.
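
The discrimination step described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in useProjectResourceGroups.ts: the ResourceGroup and VFolderHostInfo shapes, the reason field, and the function names unwrapSettled and fetchGroupsAndHosts are assumptions made for the example. Only the StorageHostFetchError name, the use of Promise.allSettled, and the precedence rule come from the description.

```typescript
// Hypothetical response shapes; the real types live in backend.ai-ui.
type ResourceGroup = { name: string };
type VFolderHostInfo = { allowed: string[] };

// Tagged error so a boundary can recognise host-info failures.
// The constructor shape is an assumption; the PR only states that the
// class extends Error and is thrown with the rejection reason.
class StorageHostFetchError extends Error {
  readonly reason: unknown;
  constructor(reason: unknown) {
    super("Failed to fetch storage host information.");
    // Keeps `instanceof` reliable when compiled down to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "StorageHostFetchError";
    this.reason = reason;
  }
}

// Pure discrimination step: a host-info failure takes precedence over a
// scaling-groups failure, which is re-thrown unchanged.
function unwrapSettled(
  groups: PromiseSettledResult<ResourceGroup[]>,
  hosts: PromiseSettledResult<VFolderHostInfo>,
): { groups: ResourceGroup[]; hosts: VFolderHostInfo } {
  if (hosts.status === "rejected") {
    throw new StorageHostFetchError(hosts.reason);
  }
  if (groups.status === "rejected") {
    throw groups.reason; // bubbles to the generic boundary as-is
  }
  return { groups: groups.value, hosts: hosts.value };
}

// Hypothetical wrapper; both fetchers stand in for the real
// /scaling-groups and /folders/_/hosts requests.
async function fetchGroupsAndHosts(
  fetchGroups: () => Promise<ResourceGroup[]>,
  fetchHosts: () => Promise<VFolderHostInfo>,
): Promise<{ groups: ResourceGroup[]; hosts: VFolderHostInfo }> {
  // allSettled never rejects, so both outcomes can be inspected.
  const [groups, hosts] = await Promise.allSettled([
    fetchGroups(),
    fetchHosts(),
  ]);
  return unwrapSettled(groups, hosts);
}
```

Keeping the precedence check in a synchronous helper makes the "both failed" case easy to exercise without real network calls.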

The useCurrentProject atom's Promise.allSettled path (which silently swallows host failure into vhostInfo = undefined) is intentionally out of scope for this PR.

Implementation

  • packages/backend.ai-ui/src/hooks/useProjectResourceGroups.ts: introduce StorageHostFetchError extends Error; rewrite the queryFn to await Promise.allSettled, throw new StorageHostFetchError(hostsResult.reason) when the host fetch rejects, and re-throw the scaling-groups rejection otherwise. The success return shape is unchanged.
  • packages/backend.ai-ui/src/hooks/index.ts: export the new StorageHostFetchError class alongside the existing useProjectResourceGroups export.
  • react/src/components/StorageHostFetchErrorBoundary.tsx (new): wraps children with react-error-boundary's ErrorBoundary. The fallback imports StorageHostFetchError from backend.ai-ui, checks error instanceof StorageHostFetchError, and either renders the dedicated Result UI or re-throws so the outer BAIErrorBoundary handles unrelated errors.
  • react/src/routes.tsx: wraps <SessionLauncherPage /> at the /session/start route with StorageHostFetchErrorBoundary so the dedicated UI replaces the page body while the breadcrumb and outer layout remain intact.
  • resources/i18n/en.json, resources/i18n/ko.json: add errorBoundary.StorageHostFetchFailedTitle and button.GoBackToPreviousPage. Other locales fall back to English (fallbackLng: 'en'); /fw:i18n can fill them in.
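
The fallback's decision logic can be sketched without React as shown below. This is an illustrative reduction, not the contents of StorageHostFetchErrorBoundary.tsx: the actual component wraps children in react-error-boundary's ErrorBoundary and renders a Result, whereas here the fallback returns a plain descriptor. The FallbackView type, the resolveStorageHostFallback name, the onPrimaryClick field, and the local StorageHostFetchError stand-in are assumptions. The two i18n keys and their English strings are taken from the description.

```typescript
// Local stand-in for the class exported from backend.ai-ui.
class StorageHostFetchError extends Error {
  constructor(readonly reason: unknown) {
    super("Failed to fetch storage host information.");
    // Keeps `instanceof` reliable when compiled down to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "StorageHostFetchError";
  }
}

// English entries for the two new keys, as stated in the PR description.
const en: Record<string, string> = {
  "errorBoundary.StorageHostFetchFailedTitle":
    "Failed to fetch storage host information.",
  "button.GoBackToPreviousPage": "Go back to the previous page",
};

// Hypothetical descriptor for what the dedicated fallback would render.
type FallbackView = {
  status: "warning";
  title: string;
  buttonLabel: string;
  onPrimaryClick: () => void;
};

// Returns the dedicated view for a StorageHostFetchError and re-throws
// anything else so an outer boundary can handle it.
function resolveStorageHostFallback(
  error: unknown,
  goBack: () => void, // in the browser this would be () => history.back()
  t: (key: string) => string = (key) => en[key] ?? key,
): FallbackView {
  if (!(error instanceof StorageHostFetchError)) {
    throw error;
  }
  return {
    status: "warning",
    title: t("errorBoundary.StorageHostFetchFailedTitle"),
    buttonLabel: t("button.GoBackToPreviousPage"),
    onPrimaryClick: goBack,
  };
}
```

Re-throwing from the fallback is what lets unrelated errors continue up to the outer BAIErrorBoundary instead of being absorbed by the dedicated one.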

How to test

  1. Force the storage host fetch to fail (e.g. kill the storage proxy or break the manager's /folders/_/hosts route).
  2. Open the Session Launcher at /session/start.
  3. Expected: the page body renders a warning Result with the title "Failed to fetch storage host information." and a "Go back to the previous page" button. The breadcrumb above remains visible.
  4. Click the button — the browser navigates back via history.back().
  5. Trigger an unrelated error elsewhere on the launcher (or break only /scaling-groups) — the existing BAIErrorBoundary behavior (page reload prompt, debug info under globalThis.backendaiwebui.debug) is unchanged.

Verification

bash scripts/verify.sh:

--- Relay: PASS ---
--- Lint: PASS ---
--- Format: PASS ---
--- TypeScript: FAIL ---  (pre-existing, unrelated files only — identical on main)

The TypeScript failures are pre-existing errors on main (StatisticsPage.tsx, StorageHostSettingPage.tsx, UserCredentialsPage.tsx, UserSettingsPage.tsx, VFolderNodeListPage.tsx, plus the pre-existing ErrorBoundary JSX type error at SessionLauncherPage.tsx:1433) — none introduced by this change.

Follow-ups

  • Translate the new i18n keys for the remaining 19 locales via /fw:i18n.
  • Address the silently swallowed host failure in useCurrentProject's resourceGroupsForCurrentProjectAtom (its Promise.allSettled path sets vhostInfo = undefined, which masks the failure and produces an empty nonSftpResourceGroups). Deliberately out of scope for this PR.

Copilot AI review requested due to automatic review settings May 12, 2026 06:47
@github-actions github-actions Bot added area:ux UI / UX issue. area:i18n Localization size:M 30~100 LoC labels May 12, 2026
github-actions Bot commented May 12, 2026

Coverage Report for react-coverage (./react)

Status | Category | Percentage | Covered / Total
🔵 | Lines | 6.44% | 1783 / 27657
🔵 | Statements | 5.3% | 1978 / 37312
🔵 | Functions | 5.17% | 296 / 5717
🔵 | Branches | 3.71% | 1293 / 34821

File Coverage (Changed Files)

File | Stmts | Branches | Functions | Lines | Uncovered Lines
react/src/routes.tsx | 0% | 0% | 0% | 0% | 41-1007
react/src/components/StorageHostFetchErrorBoundary.tsx | 0% | 0% | 0% | 0% | 16-40
Generated in workflow #751 for commit 23e6048 by the Vitest Coverage Report Action

Copilot AI left a comment

Pull request overview

Updates the session launcher’s storage (folder mounting) step so that a failure to load the storage/folder list no longer blocks session creation, by catching the error and presenting a warning fallback UI.

Changes:

  • Add new i18n strings for a “folder list unavailable” warning.
  • Wrap the storage-step VFolderTableFormItem with an ErrorBoundary that renders a warning Alert fallback when folder loading fails.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File | Description
resources/i18n/en.json | Adds English strings for the “folder list unavailable” warning shown in the session launcher.
react/src/pages/SessionLauncherPage.tsx | Wraps the storage-step folder mount table in an error boundary and shows a warning alert fallback on failure.

Comment thread resources/i18n/en.json Outdated
Comment thread react/src/pages/SessionLauncherPage.tsx Outdated
Comment thread react/src/pages/SessionLauncherPage.tsx Outdated
Comment thread react/src/pages/SessionLauncherPage.tsx Outdated
@ironAiken2 ironAiken2 force-pushed the 05-12-fix_fr-2870_allow_session_creation_when_storage_list_fetch_fails branch 2 times, most recently from 0b38f85 to a0ca791 Compare May 12, 2026 08:03
@github-actions github-actions Bot added size:L 100~500 LoC and removed size:M 30~100 LoC labels May 12, 2026
@ironAiken2 ironAiken2 force-pushed the 05-12-fix_fr-2870_allow_session_creation_when_storage_list_fetch_fails branch from a0ca791 to 1a7ecd7 Compare May 14, 2026 09:43
@ironAiken2 ironAiken2 changed the title fix(FR-2870): allow session creation when storage list fetch fails fix(FR-2870): catch storage host fetch errors in session launcher May 14, 2026
@ironAiken2 ironAiken2 force-pushed the 05-12-fix_fr-2870_allow_session_creation_when_storage_list_fetch_fails branch from 1a7ecd7 to 638b87d Compare May 15, 2026 05:31

github-actions Bot commented May 15, 2026

Coverage Report for backend-ai-ui-coverage (./packages/backend.ai-ui)

Status | Category | Percentage | Covered / Total
🔵 | Lines | 8.01% | 362 / 4515
🔵 | Statements | 7.16% | 411 / 5736
🔵 | Functions | 8.92% | 94 / 1053
🔵 | Branches | 6.35% | 362 / 5694

File Coverage (Changed Files)

File | Stmts | Branches | Functions | Lines | Uncovered Lines
packages/backend.ai-ui/src/hooks/index.ts | 0% | 0% | 0% | 0% | 32-145
packages/backend.ai-ui/src/hooks/useProjectResourceGroups.ts | 0% | 0% | 0% | 0% | 20-122
Generated in workflow #823 for commit 40ec8a3 by the Vitest Coverage Report Action

@ironAiken2 ironAiken2 force-pushed the 05-12-fix_fr-2870_allow_session_creation_when_storage_list_fetch_fails branch from 638b87d to 23e6048 Compare May 15, 2026 05:50
Comment thread packages/backend.ai-ui/src/hooks/useProjectResourceGroups.ts Outdated
Comment thread react/src/components/StorageHostFetchErrorBoundary.tsx
@nowgnuesLee nowgnuesLee left a comment

LGTM

@ironAiken2 ironAiken2 force-pushed the 05-12-fix_fr-2870_allow_session_creation_when_storage_list_fetch_fails branch from 23e6048 to 40ec8a3 Compare May 18, 2026 08:37
@ironAiken2 ironAiken2 requested a review from agatha197 May 18, 2026 08:38

Labels

area:i18n Localization area:ux UI / UX issue. size:L 100~500 LoC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow session creation when storage has errors and surface mount failure at folder-mount step

4 participants